Suppose we want to prepare data and compare several scalers and classifiers on a classification problem. We will tune the classifiers' parameters with grid search.

Preparing the data:


In [1]:
from sklearn.datasets import make_classification


X, y = make_classification()
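
As a quick sanity check, the sketch below inspects what `make_classification` returns with its default arguments: 100 samples, 20 features and two classes. The names `X_demo` and `y_demo` and the `random_state=0` argument are assumptions added for illustration; the call above is unseeded, so its data differs from run to run.

```python
import numpy as np
from sklearn.datasets import make_classification

# random_state is an assumption added here so the sketch is reproducible
X_demo, y_demo = make_classification(random_state=0)

print(X_demo.shape)       # (n_samples, n_features)
print(np.unique(y_demo))  # distinct class labels
```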

Setting the steps for our pipelines and the parameters for grid search:


In [2]:
from reskit.core import Pipeliner

from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler

from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC


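# solver='liblinear' supports both the 'l1' and 'l2' penalties searched below;
# the default solver in recent scikit-learn releases does not accept 'l1'
classifiers = [('LR', LogisticRegression(solver='liblinear')),
               ('SVC', SVC())]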

scalers = [('standard', StandardScaler()),
           ('minmax', MinMaxScaler())]

steps = [('scaler', scalers),
         ('classifier', classifiers)]

param_grid = {'LR': {'penalty': ['l1', 'l2']},
              'SVC': {'kernel': ['linear', 'poly', 'rbf', 'sigmoid']}}

Setting up cross-validation for the grid search over hyperparameters and for evaluating the models with the selected hyperparameters:


In [3]:
from sklearn.model_selection import StratifiedKFold


grid_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
eval_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
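
To see what such a splitter produces, here is a minimal sketch that iterates over the folds of a `StratifiedKFold` and prints the size and class counts of each held-out fold. The demo data and the names `X_demo`, `y_demo`, `cv` and `fold_sizes` are assumptions for illustration only. Giving `grid_cv` and `eval_cv` different `random_state` values, as above, means tuning and evaluation use different shuffles of the same data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold

# Demo data; random_state is an assumption added for reproducibility
X_demo, y_demo = make_classification(random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_sizes = []
for train_idx, test_idx in cv.split(X_demo, y_demo):
    # Count how many samples of each class fall into the held-out fold
    counts = np.bincount(y_demo[test_idx], minlength=2)
    fold_sizes.append(len(test_idx))
    print(len(test_idx), counts)
```

Each sample appears in exactly one held-out fold, and the class proportions in every fold stay close to those of the full data set.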

Creating the plan of our experiments:


In [4]:
pipe = Pipeliner(steps=steps, grid_cv=grid_cv, eval_cv=eval_cv, param_grid=param_grid)
pipe.plan_table


Out[4]:
   scaler    classifier
0  standard  LR
1  standard  SVC
2  minmax    LR
3  minmax    SVC
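
The rows of this table are every combination of one scaler with one classifier. The sketch below reproduces the same combinations and ordering with `itertools.product`; it only mirrors the table and makes no claim about how `Pipeliner` builds it internally. The variable names are assumptions for illustration.

```python
from itertools import product

scaler_names = ['standard', 'minmax']
classifier_names = ['LR', 'SVC']

# Cartesian product: each scaler paired with each classifier
plan = list(product(scaler_names, classifier_names))
for i, (scaler, classifier) in enumerate(plan):
    print(i, scaler, classifier)
```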

To tune the parameters of the models and evaluate them, run:


In [5]:
pipe.get_results(X, y, scoring=['roc_auc'])


Line: 1/4
Line: 2/4
Line: 3/4
Line: 4/4
Out[5]:
   scaler    classifier  grid_roc_auc_mean  grid_roc_auc_std  grid_roc_auc_best_params  eval_roc_auc_mean  eval_roc_auc_std  eval_roc_auc_scores
0  standard  LR          0.956              0.0338231         {'penalty': 'l1'}         0.968              0.0324962         [ 0.92 1. 1. 0.94 0.98]
1  standard  SVC         0.962              0.0278568         {'kernel': 'poly'}        0.976              0.0300666         [ 0.95 1. 1. 0.93 1. ]
2  minmax    LR          0.964              0.0412795         {'penalty': 'l1'}         0.966              0.0377359         [ 0.92 1. 1. 0.92 0.99]
3  minmax    SVC         0.958              0.0411825         {'kernel': 'rbf'}         0.962              0.0401995         [ 0.93 1. 1. 0.9 0.98]
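
For intuition, one row of this table (here `standard` + `SVC`) can be approximated in plain scikit-learn: run a grid search on one set of folds, then score the best configuration on a second set of folds. This is a rough sketch under that assumption, not Reskit's actual implementation; the demo data, the `random_state` on `make_classification` and the names `model`, `search` and `eval_scores` are illustrative, so the numbers will not match the table.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Demo data; random_state is an assumption added for reproducibility
X_demo, y_demo = make_classification(random_state=0)

grid_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
eval_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

model = Pipeline([('scaler', StandardScaler()),
                  ('classifier', SVC())])

# Step 1: tune the kernel on the grid_cv folds
search = GridSearchCV(model,
                      {'classifier__kernel': ['linear', 'poly', 'rbf', 'sigmoid']},
                      scoring='roc_auc', cv=grid_cv)
search.fit(X_demo, y_demo)

# Step 2: score the best configuration on the eval_cv folds
eval_scores = cross_val_score(search.best_estimator_, X_demo, y_demo,
                              scoring='roc_auc', cv=eval_cv)

print(search.best_params_)
print(eval_scores.mean(), eval_scores.std())
```

In this sketch, `search.best_score_` plays the role of `grid_roc_auc_mean`, while the mean and standard deviation of `eval_scores` play the roles of `eval_roc_auc_mean` and `eval_roc_auc_std`.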